ABSTRACT


This thesis analyzes engineering readiness and training onboard United States 
Navy surface ships. On the west coast, the major contributor to training is the Afloat 
Training Group, Pacific (ATGPAC). The primary objective is to determine whether the 
readiness standards provide pertinent insight to the surface force Commander and 
generate alternatives that may assist in better characterization of force-wide engineering 
readiness. 

The Type Commander has many questions that should be answered. Some of
these are addressed with Poisson and binomial models. The results include: first, the age of a
ship has no association with drill performance, whereas the number of material discrepancies is
associated with drill performance; second, drill performance decreased from the
first initial assessment (IA) to the second IA; third, on average, the number of material
discrepancies decreases from the IA to the underway demonstration (UD) for ships
observed over two cycles; fourth, ships that perform well at the UD tend to do well on four
programs at the IA; finally, training is effective.

A table characterizing ships as above average, average, or below average in drill
effectiveness at the IA and UD is supplied.

EXECUTIVE SUMMARY


It is important to the U.S. Navy that its ships be properly trained and prepared for 
battle. It is equally important that the ships be prepared for a successful deployment. 
Upon the conclusion of deployment, ships undergo the Interdeployment Training Cycle
(IDTC), with the aim of preparing the ship for the next deployment.

The most important task that the engineering department faces is the underway 
demonstration (UD). Upon the successful completion of the underway demonstration, 
the engineering department is determined to be at an acceptable level for unrestricted 
engineering operations. 

The Afloat Training Group (ATG) conducts the evaluation of engineering 
operations. ATG conducts a primary evaluation at the initial assessment (IA) and the 
final evaluation at the underway demonstration. During both observations, ATG grades 
the engineering programs, notes material discrepancies, and scores watchstanders’ 
performance in drills. These data are used to answer several of the Type Commander’s
questions and to estimate ships’ performance.

Using data collected over a five-year period from forty-four different ship cycles,
numerous models based on Poisson and binomial distributions were designed to describe
the data. The conclusions are, first, that the age of a ship has no association with
performance of drills but that the number of material discrepancies is associated with
performance of drills; second, that ships performed better in 1999 and 2000 than they did in
2001 and 2002; third, that from the IA to the UD, on average, the number of material
discrepancies decreases; and fourth, that ships that performed well at the underway
demonstration had the common feature of doing well on four programs at the initial
assessment: bearing records, legal records, quality assurance, and operating logs.

Descriptive models were produced that estimate the performance of a ship at the
UD from results found at the IA. Quantiles of the data are used to determine standards for
describing a ship as above average, average, or below average. Finally, the most
important result is that ATG training seems to be effective.

I. INTRODUCTION


A. OVERVIEW 


The Chief of Naval Operations (CNO) establishes the United States Navy’s goal 
of readiness, stating, “Our success in manpower and current readiness sets the stage for 
focusing additional resources on future readiness” (CNO, 2003). Specifically with 
regard to engineering and propulsion plant operations, the CNO directs that “the 
continuum of engineering training readiness is achieved through the establishment of the 
proper training objectives, by defining a training process which supports the attainment 
of those objectives, and by ensuring the process contains all keys for success” 
(Instruction 3540.8, p2). This guidance is then decomposed into functions and objectives 
at subordinate levels within the Navy. The Combined Fleet Forces Command has 
specified more concise guidance: “...to provide the policy and minimum
COMNAVSURFOR requirements to assist the ISIC [Immediate Superior In Command]
and commanding officer to develop a comprehensive Basic Phase training program that
integrates a sequence of individual, team, and unit training evolutions in all mission areas
and core competencies applicable to the Naval Surface Force” (CFFC 3502.1A, p1-1-2).


The Commander of the Atlantic Fleet sets engineering training and readiness 
guidance for conventionally powered surface ships and aircraft carriers, stating that 
“while the attainment of objectives takes the Engineering Department to the proper level 
of readiness, the continuum requires the level to be maintained, i.e., engineering training 
is a continuing process after the initial attainment of objectives” (Instruction 3540.8, p2).
In this way, the surface Navy has an instruction manual for training guidance.


Regarding the continuum of engineering training readiness, the Fleet Commander
established performance criteria, defined a training process, and identified functions
needed for success. Specifically, the engineering training program is designed to:
• Achieve and maintain systems and operating qualifications in propulsion,
  electrical and auxiliary watchstanders;
• Develop and maintain management, systems and watchstanding expertise
  in department supervisory personnel;
• Ensure continual readiness of engineering personnel to support routine
  propulsion, electrical and auxiliary plant operations in all conditions of
  readiness;
• Ensure continual readiness to effectively control propulsion and electrical
  plant casualties in all conditions of readiness.

In short, the Fleet Commander sets goals for the training cycle and specifies functions in
preparation for the training process (Instruction 3540.8, p2).


The Commander of Naval Surface Force (COMNAVSURFOR) oversees the 
conduct of the surface force training programs for all surface ships and units and is the
primary stakeholder for Fleet engineering readiness. COMNAVSURFOR establishes the
primary source of policy, direction, and requirements for all aspects of basic phase 
training (CFFC 3502.1A, p2-1-1). It is this perspective on readiness, for material 
condition, operator performance, and shipboard safety, which forms the focus of this 
research. 


B. INTERDEPLOYMENT TRAINING CYCLE 


Almost all surface ships go through a five-phase interdeployment training cycle 
(IDTC) that can take two years to complete and that culminates in a ship being ready for 
overseas deployment (CFFC 3502.1A, p2-1-1). Four of the phases, or quadrants, are 
typically completed in homeport and surrounding waters and the fifth stage represents the 
overall goal of mission readiness for forward deployment. The primary goal of the 
training program is to build fully proficient watchteams for shipboard operations. The 
training program also seeks to fix the watchteams’ composition for the duration of each 
quadrant so that team progress can be measured and meaningful, and to avoid disruption 
due to the turnover of individual watch personnel, which may inhibit progress for the 
team as a whole. To ensure that ships are consistently prepared to conduct effective
operations, COMNAVSURFOR has organized a regimen of ship assessments.


During deployment, the ship conducts its own Command Assessment of
Readiness and Training (CART I). At the mid-point of the deployment, the
ship assesses its operational proficiency and identifies requirements to start the first four
phases when returning from deployment. Specifically, the ship scrutinizes the following
areas:
• Formal school training status and needs;
• Basic phase elements for material readiness;
• Potential specific training requirements;
• Any systems additions or modifications that will take place upon return
  from deployment;
• Material and equipment assessment to determine equipment condition;
• Issues that may impact subsequent training.

The process generates a ship’s training plan for the IDTC based on this assessment
(CFFC 3502.1A, p2-2-1).


As a ship returns from deployment, the first phase in the training cycle 
(maintenance) begins. The condition of sensors, weapon systems, and equipment 
determines whether there are to be changes or upgrades to the current configuration. 
Reviews of maintainability and operability draw on the Preventive Maintenance System (PMS),
combat system checkout, system testing, or safety and material
inspections. During the maintenance stage, the ship may request outside assistance with
shipboard training, necessitated by the presence of updated equipment on which
personnel are not trained or by new personnel reporting to the ship without prior
experience. After the ship is back in working order and all installations are in place, the
ship progresses into the next three training phases: basic, intermediate, and advanced
(CFFC 3502.1A, p2-1-2).


During the basic phase of training, the ship’s ISIC monitors progress, provides
overall supervision throughout the training cycle and participates in selected evolutions.
The ISIC continually reviews all afloat programs and activities within its command and is
responsible for relaying the report summaries to the ship’s commanding officer
(Instruction 1710.16, p5). The ship and the ISIC conduct a Command Assessment of
Readiness and Training (CART II). The formal review of shipboard engineering is called
the initial assessment (IA), which yields a report of the evaluation. From this report, a
Tailored Ship’s Training Availability (TSTA) syllabus is developed.


TSTA is a regularly scheduled training program that is planned during the basic 
phase of training (CFFC 3502.1A, p2-1-1). Toward the end of the basic phase, assessors 
conduct an underway demonstration (UD), and ascertain whether the ship’s engineering 
Condition III watchteam, the standard operating watchteam, is at an acceptable level for 
unrestricted engineering operations. Engineers continue to train their Condition I
watchteam, the warfighting watchteam, in preparation for the Integrated Training Week
and Final Evaluation Period (FEP). The IA and UD processes and their outputs are
central to this study.


When the ship and the ISIC require outside assistance, either can request 
assistance from the Afloat Training Group (ATG). ATG “...provides the ISIC and the 
afloat commanding officer with unit-level training for the command’s training teams and 
watchstanders in order to accomplish the Interdeployment Training Cycle (IDTC) basic 
training phase objectives with a primary goal of training the ship’s crew to train 
themselves” (ATG, 2003). ATG can provide initial assistance during preparation for the 
basic phase of training, but predominantly focuses on the remainder of the training cycle. 
Afloat training is ATG’s primary goal in the basic phase of training; ATG also assists
with supporting readiness needs at the request of the ship or the ISIC. The roles and
responsibilities of ATG are discussed further in Chapter II.


The overall purpose of CART II is for the ship to evaluate its level of proficiency
upon completion of the basic phase (CFFC 3502.1A, p2-4-N-3). This level of
proficiency should include:
• Satisfaction of all “Ready to Train” criteria;
• Completion or development of a plan to complete all required schooling;
• Certification of “Unrestricted Operations” by:
    o Adequate propulsion operating equipment;
    o Status of all Items of Priority (IOP) or Repair Before Operating
      (RBO) of equipment. An IOP is an item that requires
      extraordinary outside assistance for repair, or where a ship class
      problem is suspected. An RBO is a significant deficiency that
      must be corrected prior to placing a piece of equipment, a system,
      or the entire propulsion plant in operation (Instruction
      3540.9CH7A);
    o Completion of UD with two qualified watchteams. A watchteam
      is the group of individuals that operates the engineering plant;
• Achievement of Level B for both Watchstanders and Shipboard
  Engineering Training Team.


Training levels reflect a combination of watchstander proficiency and the ability 
of the ship to sustain that training through its training team organization. The three 
training levels are denoted by A, B, and C. Proficiency level C is defined as 
“watchstanders assigned to all required watch stations but proficiency is weak.” Level B 
is defined as personnel being “...able to correctly perform routine duties commensurate 
with their rate/rating and watchstation with minimal prompting.” Proficiency level A is 
defined as the ability of “...watchstanders to consistently react correctly during sustained, 
stressful operations that involve transition to an increased level of readiness” (CFFC 
3502.1A, p2-3-2). A rate and rating are the Sailor’s job description within the ranking of 
the military. For example, a rating of EN3 identifies the Sailor as a junior-grade
engineman petty officer or an E-4, whereas an EN1 identifies the Sailor as an upper-grade
engineman petty officer or an E-6.


ATG is the type commander’s “executive agent” for afloat training. The use of
ATG during basic phase assures standardization in conducting and assessing training
(CFFC 3502.1A, p1-2-2). For this purpose, the shipboard training team is trained by
ATG personnel and the process is referred to as “train the trainer.” This enables ships to
continue training their crew through the rest of the IDTC in the hopes of maintaining a
high level of proficiency.


After the basic phase of training is completed and the command is deemed ready 
to proceed with intermediate training, the ship commences training as an integrated part 
of a strike group. The Fleet Commander is then responsible for the intermediate training 
phase. This phase is focused on warfare team training and initial multi-unit operations. 
During this phase, ships develop warfare skills in coordination with other units while 
maintaining unit proficiency. Upon completion of the intermediate phase, the advanced 
phase of training allows continued development and refinement of integrated battle group 
warfare skills and command and control procedures needed to meet the major fleet 
commander’s specific mission requirements (CFFC 3502.1A, p2-1-2). 

C. ENGINEERING TRAINING PROCESS 


As resource providers to combatant commanders, the fleet commanders are 
responsible for providing the combat-trained, manned and equipped forces to the 
combatant commanders and thus have primary responsibility for the tactical training of 
naval forces (CFFC 3502.1A, p2-1-1). A subset of the exercises, engineering training 
and readiness, is assessed by the ISIC and ATG. The assessment focuses on material 
readiness, the proficiency level of engineering watch sections and training teams, the 
effectiveness of safety and management programs, and includes an evaluation of the crew 
in conducting a main space fire drill, considered to be the greatest and most complex 
threat during engineering plant operations. The IA report then assists the Commanding
Officer in the development of engineering training objectives and a training plan for the
Basic Phase of training.


After periodic training, provided by ATG, a final and separate ATG team 
conducts the one-day underway demonstration (UD) of the ship. This final assessment of 
engineering determines the level of proficiency of the ship’s engineering crew. In 
summary, the goal is engineering readiness. The processes include review of material
condition and administration but focus on engineering operations, evolutions, and drills,
which are discussed in Chapter II. Evolutions are tasks assigned to an individual
watchstander that are performed routinely. Drills are the practice of performing the
approved procedure to prevent equipment damage during a casualty. Assessors
compare the number of satisfactorily completed evolutions or drills tasked per watchteam
with the number that were not or could not be completed. The objectives of
the process are maximal safety, maximal watchstander performance and minimal
degradation to the engineering plant. The measures of effectiveness (MOEs) involved
with these objectives, in particular those associated with watchstander performance, are
the focus of this thesis.
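
The comparison of completed against not-completed evolutions or drills amounts to a simple proportion per watchteam. A minimal Python sketch of that proportion follows; the class name `WatchteamTally`, its fields, and the counts are hypothetical illustrations and are not taken from the assessment data or from any ATG scoring instruction.

```python
from dataclasses import dataclass


@dataclass
class WatchteamTally:
    """Hypothetical tally of drills (or evolutions) tasked to one watchteam."""
    completed: int      # graded as satisfactorily completed
    not_completed: int  # not completed, or could not be completed

    @property
    def attempted(self) -> int:
        # Total number tasked is the sum of both outcomes.
        return self.completed + self.not_completed

    @property
    def effectiveness(self) -> float:
        # Fraction of tasked drills that were satisfactorily completed.
        if self.attempted == 0:
            raise ValueError("no drills or evolutions were tasked")
        return self.completed / self.attempted


# Invented example: 7 of 10 tasked drills completed satisfactorily.
tally = WatchteamTally(completed=7, not_completed=3)
print(f"drill effectiveness: {tally.effectiveness:.0%}")
```

Keeping both counts, rather than only the ratio, preserves the number of drills tasked, which matters when proportions from watchteams with different drill loads are compared.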

D. METHODOLOGY 


The Commander, Naval Surface Force, has identified a specific objective for 
engineering training. This objective is to maximize the proficiency of watchstanders to 
perform tasks and duties, and the ability of the ship to sustain that training on its own 
(CFFC 3502.1A, p2-3-2). The current measure of effectiveness is defined for two watch 
sections and the engineering training team being assessed in evolution and drill
effectiveness. Specifics of the current measures are described in Chapter II.


This analysis evaluates whether the current MOEs for the shipboard
engineering training and readiness process adequately reflect the stated objectives. After
the objectives are described, the measures of effectiveness are discussed, using data
analysis from four years of assessments.


Over 5000 pieces of nonnumeric data are available from a four-year period, 
including IA, work-up cycles, and UD activities. Using parametric and non-parametric 
techniques, the process models material condition levels using measures of program 
efficiency, counts of material discrepancies, and watchstander proficiency as measured 
by drill efficiency levels. The thesis closes with a proposed model for assessing fleet
engineering readiness levels.
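
The two distribution families named for this kind of data, Poisson for counts of material discrepancies and binomial for drill outcomes, can be illustrated generically. The sketch below is not the model developed in this thesis: it assumes the simplest case of one common Poisson rate and one pooled binomial proportion, and every function name and number in it is invented for illustration.

```python
import math


def poisson_rate_mle(counts):
    """Maximum-likelihood Poisson rate: the sample mean of the counts."""
    return sum(counts) / len(counts)


def poisson_pmf(k, rate):
    """P(K = k) for a Poisson count with the given rate."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def pooled_binomial_mle(successes, trials):
    """Maximum-likelihood success probability pooled over all ships."""
    return sum(successes) / sum(trials)


def binomial_pmf(k, n, p):
    """P(K = k) effective drills out of n tasked, each effective with probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


# Invented counts of material discrepancies for five ship visits.
discrepancies = [12, 8, 15, 9, 11]
rate = poisson_rate_mle(discrepancies)

# Invented drill results: effective drills out of drills tasked, per ship.
effective = [6, 4, 7]
tasked = [10, 10, 10]
p_hat = pooled_binomial_mle(effective, tasked)

print(f"estimated discrepancy rate per visit: {rate:.1f}")
print(f"estimated drill effectiveness: {p_hat:.3f}")
print(f"P(exactly 7 of 10 drills effective): {binomial_pmf(7, 10, p_hat):.3f}")
```

A fuller analysis would let the rate or proportion depend on covariates such as ship age or assessment type; the pooled estimates here only show what each distribution describes.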

II. ASSESSING ENGINEERING READINESS AND TRAINING


A. WHY ATG EXISTS 


Afloat Training Group (ATG), formerly known as the Afloat Training 
Organization, was established during World War II. Its primary mission was to provide 
shore-based and underway training to ships’ crews. Operational Training Command was 
created to lead the Training Group to verify that each ship was properly outfitted, 
manned, and capable of fighting and sustaining the casualties of war. During World War 
II, the creation of the Training Group provided lessons learned for improvements in the
design, construction, and operation of new warships.


In the 1960’s, the Training Group conducted intensive periods of underway 
refresher training for surface ships and, in conjunction with the Board of Inspection and 
Survey (INSURV), certified ships’ seaworthiness and readiness to operate. The
Propulsion Examination Board (PEB) was created in the 1970’s in response to engineering
plant problems in general and, specifically, to the frequency of main space fires
during that decade.


In the early 1990s, the Navy adopted a tactical training process, designed to 
achieve combat readiness and interoperability, and formed Afloat Training Organization 
(ATO) to provide oversight and training. ATG sees its mission as providing “...unit-level 
training for the command’s training teams and watchstanders in order to accomplish the 
Interdeployment Training Cycle (IDTC) basic training phase objectives with a primary 
goal of training the ship’s crew to train themselves” (ATG, 2003). 


ATG functions as the senior assessor during the IDTC basic training phase, 
provides assistance in maintaining afloat unit-level proficiency through the use of the 
Limited Training Team (LTT), and serves as the executive agent for 
COMNAVSURFOR in conducting assessments, while retaining the role of guiding the
process. As senior assessor, ATG is responsible for the overall assessment of a ship,
acting as final authority on all matters relating to the assessment and certification
execution and reporting, and providing final approval of major deficiency corrections.
ATG keeps the Commanding Officer informed of the assessment, and observes the
Engineering Officer of the Watch (EOOW) during operational evaluation. The EOOW is
“...the officer on watch in charge of the main propulsion plant of the ship, and of the
associated auxiliaries...responsible for the safe and proper operation of such units, and
for the performance of the duties prescribed in these regulations and by other competent
authority” (Navy Regulations, Chapter 10).


As space assessors, ATG members assess material conditions of the space,
observe and contribute to the evaluation of watchstander performance during casualty
control and routine plant operations, and assess selected administrative management
programs. The reason for review of these management programs is to gain insight into
material processes in the engineering plant (Manual 3540.9Ch2A). These programs
include:
• Safety programs: electrical, system isolation (also known as tagout), heat
  exposure, and hearing conservation procedures.
• Administrative records: legal records and operating logs. These two
  records can demonstrate the material condition of the ship; if the
  paperwork is thorough, the in-space work should be completed.
• Lube Oil Quality Management (LOQM). Assessors determine
  watchstander knowledge in lube-oil sampling requirements and criteria.
  When LOQM is not properly maintained, typically, the personnel do not
  take the routine lube oil samples properly. This could lead to breakdown
  of equipment from excessive wear and tear.
• Fuel Oil Quality Management (FOQM). Inspectors sample and test fuel
  oil service tanks and transfer of fuel oil, examine the oil spill kit and
  required fuel testing equipment, and determine the qualification of the
  personnel conducting fuel oil testing. When the FOQM program is poor, clear
  fuel may not be burning in the engines. As with dirty lube oil in the case
  of LOQM, this can lead to excessive wear and tear of the equipment and a
  shorter service life.
• Marine Gas Turbine Equipment Service Records. Reviewers verify that
  all required technical directives are incorporated, that selected component
  serial numbers are logged and match installed components, and that unusual
  events and maintenance of engines are annotated in appropriate sections of
  the logs. This indirectly reflects gas turbine material conditions.
• Bearing Records. Ship's documentation reports the maintenance and
  current condition of the ship's bearings. Again, this is a direct indicator of
  material condition.
• Quality Assurance. Meaningful shipboard and work center audits are
  conducted and reports are submitted to the chain of command. When a
  ship performs material maintenance actions, a quality assurance report is
  produced to ensure the work is completed properly.


Finally, ATG is responsible for the construction of the evolution and drill plan by 
which the ship watchstanders are assessed through coordination with the ship’s 
Engineering Training Team leader. The production of a thorough assessment is expected 
to provide objective results. These assessments serve as the source of the data analyzed 
in this thesis. 


B. TYPE COMMANDER OBJECTIVES 


Aligned with the goals of safety and readiness, the type commander has three 
primary objectives for all surface ship engineering readiness. These objectives are to 
maximize safety, minimize plant degradation, and maximize watchstanders’ 
performance. 


1. Maximize Safety 


The most dangerous event, one that can cost money and lives, is a main space fire. 
Approximately 75% of all shipboard fires occur in the engineroom (DCIN #4). In the 
1970’s, there were thirty reported main space fires that accounted for five deaths. In the 
period 1980-1993, there were 387 reported main space fires with only four deaths. Since 
1993, there have been twenty-seven main space fires with only one death (Naval Safety
Center, request no. 07). The reason for the larger number of occurrences in the 1980’s
and early 1990’s may have been more stringent reporting criteria. Although the reporting
criteria have remained consistent, the number of fires has dropped significantly over the
past ten years.
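
For reference, the casualty figures quoted above can be put on a common per-fire basis. The sketch below only divides the reported deaths by the reported fires in each period; the dictionary labels and the function name are illustrative, and the comparison inherits any differences in reporting criteria between periods.

```python
# Reported main space fires and deaths per period, as quoted in the text.
periods = {
    "1970's": (30, 5),
    "1980-1993": (387, 4),
    "since 1993": (27, 1),
}


def deaths_per_fire(fires, deaths):
    """Deaths divided by reported fires for one period."""
    return deaths / fires


for label, (fires, deaths) in periods.items():
    ratio = deaths_per_fire(fires, deaths)
    print(f"{label}: {ratio:.3f} deaths per reported fire")
```

This gives roughly 0.167, 0.010, and 0.037 deaths per reported fire for the three periods, so the drop in the number of fires since 1993 is not matched by a drop in deaths per fire relative to 1980-1993.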


Within a ship’s design, there are four major compartments that comprise the 
engineering spaces. On cruisers and destroyers, there are two compartments serving as 
the main engine rooms and two as auxiliary rooms, whereas a frigate has one main engine 
room and three auxiliary rooms. The combination of potentially dangerous electrical and
machinery operations, extreme heat conditions, and presence of flammable material makes
these spaces the focus of intense scrutiny for safe operations.


The main engine room has equipment that controls the propulsion of the ship. 
This includes the engines and the main reduction gear. The engines for this study are gas 
turbine engines, LM-2500, that have sixteen stages of air compression, a controlled 
combustion of air and fuel mixture, and an expansion stage for moving the high speed 
flexible coupling shaft that is connected to the main reduction gear. The power that is
produced rotates the main reduction gear, which is connected to the shaft for propulsion
through the water.


Each engine room holds two gas turbine engines connected to the reduction gear. 
Each engine can burn approximately one thousand gallons of fuel per hour. The 
reduction gear has a sump for lube oil storage in excess of fifteen hundred gallons. These
flammable liquids have the potential to create a main space fire. The auxiliary room
houses the fuel transfer pumps and fuel service tanks. These two components
feed fuel to the main engines. Each room contains two service tanks, each holding five
thousand gallons of fuel. As in the engine room, a fire in this space could be disastrous.


If any one of these spaces experiences fire, it is classified and reported as a main 
space fire. Ships’ crews routinely spend time preparing to fight a main space fire. A fire 
causes a ship to go to its highest level of readiness and a fire is an entire ship’s problem, 
not just a problem for engineering. The ship is its own fire department. Containing a
main space fire requires the effort of the entire crew, and the entire ship trains together
for this event. During the basic phase of training, each ship is observed fighting a
simulated main space fire until the ISIC is satisfied with its efforts and deems it proficient
in the handling of this crucial event.


2. Minimize Plant Degradation 


Ships need to be in a condition of material readiness to support underway 
operations. “The material condition of the ship's propulsion plant [is] assessed to validate 
the ship's material self-assessment and verify readiness during the assessment and 
certification process” (Manual 3540.9, p 3-1-1). A ship should be capable of
self-assessment with its own Engineering Training Team, and have equipment necessary to
operate, recover from equipment degradations, cover operational impact with written
procedures, and maintain cleanliness, preservation, and proper stowage in the spaces.


Each ship class has particular equipment. Equipment determined to be
non-operational and listed as either an IOP or RBO makes it difficult for a ship to get
underway. Appendix A is a listing of the required equipment on board each ship class.


An example of an item of priority is a lube oil leak on the casing of the main 
reduction gear. A reduction gear is required to be operable. This deficiency is a potential 
fire hazard, can increase the ship’s workload, and can decrease the service life of the ship. 
An example of a repair-before-operating item is welding slag on the deck of an inlet plenum of a
gas turbine engine. Slag is a foreign object that can be sucked into the intake of the 
engine, causing catastrophic damage, and placing the engine out of commission until the 
problem is fixed. If the other engine on the same shaft has a similar problem, the ship 
remains at the pier until repairs are complete. 


3. Watchstanders’ Performance 


The type commander has directed assessment of the ability of watchstanders to
carry out routine propulsion plant evolutions and casualty control procedures
(Instruction 3540.9CH5A, p5-1-1). Evolutions can include observing a watchstander
draw a lube oil sample, which the EOOW can also observe for cleanliness. This is an
example of two watchstanders performing one daily task, or evolution. Another example
is the gas turbine module inspection (GTMI), or an exterior inspection of the gas turbine
engine. This evolution involves the EOOW, who grants permission; the watchstander
conducting the evolution; and a safety observer. Although three people work the evolution, it is
assessed as one evolution and the watchstander conducting the inspection is the one
graded. These two examples demonstrate how an evolution can be performed by one
person or several, that they are routine, that they need to be checked for accuracy, and
that they reflect on watchstander performance.


The process of conducting and practicing casualty control procedures is referred 
to as a drill. Drills are a team effort. For example, one drill is the gas turbine engine 
stall. An engine stall occurs when the air compression is interrupted. The engine could also
have blade damage, causing it to be inoperable. During a real event or the drill, four 
watchstanders are involved. The EOOW has the big picture and oversees the other three 
watchstanders to ensure that they properly conduct their steps. For the gas turbine stall,
the EOOW orders the propulsion auxiliary control console (PACC) operator to stop the 
affected engine, and the two in-space watchstanders conduct a visual inspection of the 
engine. Upon controlling the casualty, the watchstanders conduct a GTMI to determine 
the extent of the damage. This illustration demonstrates the importance for the 
watchteam to be effective in conducting drills and evolutions as they can be separate 
events or be combined into a chain of events. A failure to act properly could result in the 
loss of the engine. If the engine needs to be replaced, the ship must return to port, order a 
new engine, and obtain outside assistance with proper installation. This requires
numerous man-hours, the cost of the replacement engine, and several days in port. 


C. MEASURING ENGINEERING EFFECTIVENESS 


Metrics enable determination of whether objectives are met. Managing readiness 
requires the type commander to be able to determine if ships are safe, in good condition 
and ready for casualties. This is where ATG plays an important role. ATG is the 
principal resource in considering the TYCOM objectives and measuring how ships are 
meeting the goals. 


1. Assessing Safety 


ATG reviews firefighting capabilities to determine whether the ship has any
safety issues. This is a bow-to-stern assessment of the ability of the crew. It includes a
review of “Main Space Fire Doctrine, evaluation of the material condition of damage
control and emergency equipment, observation of main space fire drills, observation of
flammable housekeeping practices, and observation of DCTT's ability to effectively train
and evaluate” (Manual 3540.9, p 6-1-1). All guidance for review must comply with the
type commander’s instruction and guidance in the Naval Ships’ Technical Manual
(NSTM) 555. The manual outlines firefighting procedure, firefighting equipment,
damage control systems, and evacuation and isolation of spaces.


During initial assessment, ATG reviews safety conditions and repair locker inventory, 
and inspects firefighting equipment for completeness and operability. Also, an assessor 
conducts Halon checks in one or all engineering spaces for timing of the pressure 
switches and to verify that the horns and lighting are effective. After checking all 
equipment, ATG observes one main space fire drill, written by the project officer but 
conducted by the ship training team, to verify that space isolation is conducted in 
accordance with ship procedure and NSTM 555. ATG also verifies that the ship achieves 
complete space isolation. 


ATG observes initial in-space reactions to report, deflect, and isolate the space 
leak, repair locker activity, and communications with damage control center. All of the 
information is collected and shared with the observers, who then determine the effectiveness 
of the drill. ATG also comments on discrepancies found. 


ATG has modified its grading of main space fire drills several times in the past four 
years. At the end of 2000, a new point-system checklist was developed. Over the next 
year, the weights associated with each element of the watchstanders’ performance were 
changed. As of the end of 2001, the system had not been formalized. The end result was 
that the checklist was only a guide for assessing the drill. Some waterfront ISICs 
approved of the checklist and others disagreed with it. Overall, most main space fire 
drills were graded as not effective or partially effective. However, the ISIC usually 
assesses main space fire drills as effective during the basic phase of training (Mallette, 
2004). Therefore, data on main space fire drills have been excluded. 


2. Measuring Plant Degradation 


ATG reviews items from the ship’s self-assessment of material readiness by 
looking at the degraded equipment list, the departure from specification waiver 
list, and any IOPs and RBOs. “Material condition of the ship's propulsion plant will be 
assessed to validate the ship's material self-assessment and verify readiness during the 
assessment and certification process” (Manual 3540.9, p3-1-1). An additional item for 
review is the ship’s individual program listing. IOPs and RBOs are noted on the initial 
assessment report and the underway demonstration report. ATG also reviews 
administrative programs in full depth during the initial assessment, but not at the 
underway demonstration except at the request of the ISIC. 


One critical element that ships have to account for in determining plant 
degradation is the administrative programs. Satisfactory programs can be an indicator of 
engineering management and may provide insight into the condition of the plant. There 
are thirteen primary programs, of which an average of ten are assessed. It can be 
hypothesized that there is a correlation between program management and material 
condition. The proportion of effective or partially effective programs could reflect plant 
condition, and will be evaluated and further explained in Chapter III. 


The number of RBOs and IOPs should also be an indicator of plant 
condition. From reviewing the data, an average of 1.50 RBOs and 1.78 IOPs per 
assessment indicates that, on average, about three pieces of equipment are 
non-operable. As noted in Appendix A, each ship can have some non-operable equipment, 
but too much prevents the ship from being certified for “unrestricted operations”. This 
will be further explained in Chapter III. 


3. Watchstanders’ Performance 


ATG observes watchstanders’ evolutions in two ways during the initial 
assessment: during the material checks and during evolution tasking. During the 
underway demonstration, ATG observes evolutions during the tasking phase. After the 
conclusion of an evolution set, ATG observes the watchteam drills. Typically the two 
watchteams have eight drills each. Conduct of the evolution and drill set for the two 
watchteams can take an entire day. When the day is finished, ATG has a great deal of 
data on the ship and will be able to make recommendations to the ISIC on the ship’s level 
of proficiency. ATG observes the watchstanders’ performance during casualty control 
drills as selected by the project officer. It then grades the drill as effective or not 
effective. 


The proportion of effective drills is a measure of watchstander performance. 
When a watchteam has a high proportion of effective drills, it should be deemed an 
effective team. The current threshold is 50% effectiveness at the underway 
demonstration (CFFC 3502.1A, p2-4-N-2). A ship whose drill effectiveness is 50% or higher 
passes the assessment and is declared to be satisfactory. Chapter III will go into further 
detail about measuring the current threshold. Table 1 summarizes the three objectives, 
how they are measured, and the numerical scale of the data observed. 











Objective                    Measure                       Scale 
Maximize Safety              Main Space Fire Drills        No Data 
Minimize Plant Degradation   RBO/IOP                       0-N (Number of Equipment) 
Minimize Plant Degradation   Proportion of Sat Programs    0-1 
Maximize Performance         Proportion of Sat Drills      0-1 

Table 1. Summary Box. 


D. THE MODELS 


An ideal ship is safe, in good material condition, and exhibits strong watchstander 
performance. A ship’s crew should be well trained for fighting a main space fire. As 
discussed, there are no consistent data for measuring firefighting effectiveness. This is due 
in part to scoring standards changing over recent years. Therefore, it is difficult to 
determine the level at which an ideal ship should be. 


Good material condition of the ship and proper program management imply 
minimal plant degradation. From the reports produced at the initial assessment, 
the number of RBOs and IOPs were recorded and analyzed, and the condition of the 
ship’s programs determined. Watchstander performance, in routines and in casualty 
control, is also important. Both the initial assessment and underway demonstration 
reports record how well the watchteams performed. Data were collected and recorded 
for all drills during both assessments. 


Chapter III will discuss the models in further detail. In this thesis, the second 
and third issues (plant degradation and watchstander performance) will be examined. 


III. THE MODELS 


A. IDEAL SHIP 


In Chapter II, the Type Commander’s goal of readiness was decomposed into 
material condition, watchstanders’ performance, and safety. This chapter seeks to 
translate these goals into a model to assess force-wide engineering readiness. The model 
is designed to describe the state of readiness and enable decision makers to gain insight 
into its distribution in the fleet. The principal indicators for this model are the data ATG 
collects on shipboard material condition and watchstander performance. 


Material condition is a vital component of keeping the ship operating. Because the 
oceans are salt water, rust is a factor in engineering plant degradation. 
Sailors spend many hours preserving the ship and preventing rust. The engineering plant 
personnel must also properly maintain the equipment. Each department follows a 
maintenance schedule for preventing damage. Yet damage does occur through wear and 
tear and operator error. 


Preventative maintenance is conducted routinely. ATG observes the material 
condition of the ship in several ways: through programs, material inspections, and 
evolutions. During the initial assessment, ATG performs a review of the administrative 
programs. Each program has a listing of what is expected. This checklist is provided and 
evaluated by ATG. Each program is then graded as effective, partially effective, or not 
effective. The data from the ships’ initial assessments are available for evaluation. 


During the first day of the initial assessment, ATG performs material checks on 
the engineering equipment. The ship personnel follow written procedures and 
demonstrate the checks to the ATG assessor. When a piece of equipment fails a check, it 
can be categorized as an item of priority or repair before operating (IOP or RBO, 
respectively). After the three-day assessment, ATG includes in the final report the 
equipment that still falls in one of those two categories. During the underway 
demonstration, the equipment that did not pass during the IA will be given a cursory 
review to see if it complies. If the equipment is still not operating properly or other 
equipment is discovered to be faulty, then it is listed on the final UD report in one of the 
two categories. These data are also available for the initial assessment and underway 
demonstration. 


For proper maneuvering of the ship, the engineering plant must be operationally 
sound. This includes the watchteams’ ability to prevent damage to equipment during a 
casualty. During the initial assessment and underway demonstration, ATG observes 
casualty control procedures. The ship’s engineering training team imposes drills that 
ATG grades. ATG grades a drill as either effective or not effective. If the watchteam 
commits a safety violation or takes casualty control steps out of sequence, the drill is 
scored as not effective. These data are consistent and on record, and serve as a basis for 
analyzing watchstander performance. This is not the case for firefighting. 


Because 75% of all fires occur in the engineering spaces, the engineering 
department is in charge of training for fires. To plan for firefighting training, the 
engineer officer coordinates main space fire drills. Running a full drill requires the entire 
ship’s crew. For every IDTC, a ship conducts several main space fire drills until the ISIC 
certifies the ship. During the initial assessment, ATG monitors and evaluates one of 
these drills. By the time of the underway demonstration, these drills are typically assessed 
as effective. This provides limited insight into the safety of the ship, and therefore the data 
are not descriptive. Because of the changes in grading criteria cited in Chapter II, and the 
relative lack of quantification involved, main space firefighting data are not analyzed 
in the current study. 


B. DISTRIBUTION 


When creating models to describe a future outcome, the choice of distribution is the 
largest determining factor. In this thesis, several models were created and examined to 
determine the best fit. For the four descriptive models, the binomial and Poisson 
distributions fit best. 


1. Binomial Distribution 


The binomial distribution is best used for observations that take on one of only 
two values, success or failure. This is the case for analyzing drills, which are either 
effective or not effective. When using the binomial distribution for testing, there are two 
major assumptions that need to be made: first, that trials, or drills, are mutually 
independent; and second, that each trial has a probability of resulting in “effective” and 
this probability is the same for all trials in a particular set (Conover, p124). In this case 
the assumptions imply that a team’s success or failure on a particular drill is unaffected 
by its performance on earlier drills, and that each drill is equally likely to be rated 
“effective” for a particular ship in a particular stage (IA or UD). 
That is, we assume watchstanders’ performance on a drill is not degraded if the preceding 
drill was graded “ineffective,” and that there are no “hard” or “easy” drills among the 
ones we consider. Shipboard training teams seek to isolate watchstander performance 
among drills, so this appears to be a reasonable starting point. 
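
As a minimal sketch of what these two assumptions imply, the Python snippet below computes binomial probabilities for one set of drills. The values used (eight drills, each effective with probability 0.5) are illustrative assumptions rather than estimates from the data, and the function names are hypothetical.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k of n drills are effective), assuming independent
    drills that share a common probability p of being effective."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(at least k of n drills are effective)."""
    return sum(binomial_pmf(j, n, p) for j in range(k, n + 1))


# Hypothetical watchteam: 8 drills, each effective with probability 0.5.
n_drills, p_effective = 8, 0.5
p_half_or_better = prob_at_least(4, n_drills, p_effective)
print(f"P(at least 4 of 8 effective) = {p_half_or_better:.4f}")
```

If either assumption fails (for example, if a poor drill degrades the next one), these probabilities would no longer follow from a single value of p.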


2. Poisson Distribution 


The Poisson distribution is a standard model for processes that produce integers. 
The process is a simple parametric model that is commonly used for the analysis of 
certain kinds of recurrence data. For an integer-valued counting process ranging from 
zero to infinity, three conditions must be satisfied: first, the numbers of recurrences in 
disjoint intervals must be independent; second, the recurrence rate is positive; and third, 
the number of recurrences at time zero must equal zero (Meeker and Escobar, page 406). 
A very important application arises in connection with the occurrence of events of a 
particular type over time (Devore, page 134). 


The Poisson is a reasonable model for material discrepancies. The model assumes 
that the presence or absence of a particular material discrepancy neither causes nor 
inhibits any other discrepancy. However, this is not necessarily realistic. For example, all 
ships have two high pressure air compressors (HPAC). If one is down, there is a greater 
likelihood that the ship will do its best to ensure the other is operational, which 
indicates dependency. That said, there are numerous configurations and redundancies 
that minimize dependence among different types of equipment, such as an HPAC and 
an engine. Thus the independence assumption is reasonable. 




C. HYPOTHESES 


Several hypotheses emerge from the Type Commander’s 
perspective on force-wide readiness. The first question concerns the importance 
of individual ship classes. Is it reasonable to aggregate engineering readiness data from 
all combatants? This is important for developing insight across the fleet. It also enables 
ATG to assess all training teams similarly because it leads to the understanding that each 
ship should be assessed at the same level as opposed to treating ships from different 
classes differently. To answer this question, we test the hypothesis that the data can be 
viewed as coming from one group. 


The second question reflects on the age of the ship. Do older ships tend, on 
average, to have more material discrepancies? The null hypothesis for this question is 
that the age of the ship has no effect on the number of RBOs and IOPs during an 
assessment. A failure to reject this null hypothesis means that regardless of age, ships 
may be evaluated to the same standard. 


Readiness is not static; therefore, not only the current assessment cycle is of 
interest, but also how commands do from one cycle to the next. There are three ship cycle 
trends that develop the third null hypothesis. The first trend is how the force does from 
one year to the next. These data can be broken into the individual years and a 
comparison can be done from year to year. The second trend is how a ship performs 
from one interdeployment training cycle to the next. This involves comparison from an 
individual ship’s first cycle to its second cycle. This applies to data available for fifteen 
ships during deployment cycles between 2000 and 2003. 


The third trend is whether the age of the ship has any effect on the 
number of material discrepancies; one may want to know if a ship is having an increase 
in material discrepancies caused by another unknown factor. This is accomplished by 
looking at the material condition and ship’s age across deployment cycles. These three 
trends determine the third null hypothesis, which is that there is no change in the number 
of RBOs and IOPs from one ship cycle to the next. This analysis will detect whether ships’ 
material condition is improving, degrading, or remaining the same. 


The final insight among different interdeployment training cycles is the 
performance of watchstanders in drills. Drills are assessed at both the initial 
assessment and the underway demonstration. If there is a change from one initial 
assessment to the next initial assessment, this could suggest that ships are either better or 
worse prepared for the start of each basic phase of training. In addition, the underway 
demonstration observation could explain how well ships respond to training during the 
basic phase. Therefore, the fourth null hypothesis is that there is no change in 
drill performance between ship cycles. 


Finally, engineering departments spend numerous hours with administrative 
paperwork. There are sixteen programs that are maintained and assessed. Is there any 
insight regarding how these programs reflect the overall readiness of a ship? The final 
null hypothesis is that programs are no help in determining a ship’s material condition 
level. 


D. MODELS 


Generalized linear models can provide a good fit to these data sets. A descriptive 
model helps both the training team and the ship by producing estimated upper and lower 
bounds on average performance. Those ships that are expected to have very high 
performance may not need as much training; the training team might be able to dedicate 
more time to those ships with weaker performance. 


Applying the Poisson distribution, this study develops models to describe the 
number of RBOs, IOPs, and sum of RBOs and IOPs at the underway demonstration. 
Using the data, generalized linear models can use the number of discrepancies found at 
the initial assessment to estimate the number that will be observed at the underway 
demonstration. The fourth model also uses a generalized linear model to estimate the 
drill performance at the underway demonstration. This model uses the binomial 
distribution. 


IV. DATA ANALYSIS 


A. ASSUMPTION 


The initial assessment and underway demonstration data reflect forty-four ship 
interdeployment training cycles. The data included seven Aegis Cruisers (CG), nine 
Aegis Destroyers (DDG), eight Frigates (FFG), and five Destroyers (DD). Fifteen of 
these ships completed two full cycles during the period under analysis. To determine 
whether the Type Commander can aggregate all the data to draw conclusions about the 
state of combatant engineering readiness, we test whether all four ship classes can be 
treated as one group, or whether the data need to be separated into individual ship classes. 


The common level of significance (alpha) is 0.05. “The p-value (or 
observed significance level) is the smallest level of significance at which the null 
hypothesis would be rejected when a specified test procedure is used on a given data set” 
(Devore, page 342). When testing a hypothesis, if a p-value less than 0.05 is 
observed then the null hypothesis is rejected in favor of the alternative hypothesis. For 
the purpose of this thesis, the alpha level will remain consistent at 0.05. 


Our analysis is on the complete set of 44 ship cycles from San Diego. However, 
we proceed as if our sample of 44 cycles constitutes a random sample from a hypothetical 
population of West Coast ship cycles we might have seen over a different time period. 
All inference in this thesis refers to that hypothetical population. There is no evidence to 
question this approach. 


Here, the first null hypothesis is that all four ship classes have a common average 
readiness level when entering the initial assessment. The alternative hypothesis is that 
ship classes have different means. In order to test this hypothesis, we treat readiness as 
the response variable and ship class as the predictor. An analysis of variance (ANOVA) 
was performed on this model to get a p-value (Devore, 1995). The ANOVA assumes the 
populations are all normal with the same variance. The ANOVA models the four 
samples to determine if the null hypothesis is plausible. The p-value of 0.786 is much 
greater than the 0.05 alpha level. Although the data points come from binomial 
proportions, the residuals were reviewed and seemed to follow the Normal distribution 
with no indication of heteroscedasticity. Therefore, the null hypothesis is not rejected; 
we conclude that the evidence fails to support different treatment of ships according to 
class. The data will therefore be treated as a single group for this study. Table 2 shows 
the results. 

             Degrees of Freedom   Sum of Squares   Mean Square   F-Value   Pr(F) 
Type         3                    0.029            0.009         0.354     0.786 
Residuals    40                   1.104            0.027 

Table 2. Ship Class Analysis of Variance Table. 


B. TYPE COMMANDER OBJECTIVES 


Referring to the desired force-wide insights cited in Chapter III, this section 
analyzes the role of ship’s age and ship’s cycle trends, for both RBOs and IOPs and 
drills. 

1. Age of Ship 


Using the commission date of the ship and comparing that to the date of the 
assessment, we determined the age of each ship in months. The average age was 
determined to be 165 months, or just less than fourteen years. The data were also used to 
examine the relationship between age of the ship and the number of RBOs and IOPs at 
both the initial assessment and underway demonstration. 


The hypothesis is the same for both the initial assessment and the underway 
demonstration. The null hypothesis states that the age of the ship is not associated with 
the number of IOPs and RBOs. The alternative hypothesis is that there is an association 
between the number of IOPs and RBOs and ship’s age. 


For the initial assessment, a logistic regression was constructed. This is a very 
common model for data with a binary outcome (Hamilton, page 220). Let $i$ index the ship 
cycles, so that $i = 1, \ldots, 44$; let the number of drills for that cycle be denoted by $n_i$ and 
the number effective by $Y_i$. Write $x_{id}$ for the number of discrepancies in the $i$-th cycle and 
write $x_{ia}$ for the ship’s age at that cycle. Then this logistic regression proposes 

$Y_i \sim \mathrm{Binomial}(n_i, p_i)$ for $i = 1, 2, \ldots, 44$ and 

$\mathrm{logit}(p_i) = \log\left[\frac{p_i}{1 - p_i}\right] = \beta_0 + \beta_d x_{id} + \beta_a x_{ia}$. 

The logit function converts probability to log-odds. Although age is the primary variable 
of interest in this model, we adjust for the number of discrepancies. In this way we 
accommodate the common belief that older ships might have both more discrepancies 
and fewer “effective” drills. The coefficient $\beta_a$ in the above equation represents the effect 
of age on the log-odds of a particular drill being effective, after adjusting for different 
numbers of discrepancies. The analysis of deviance chi-squared test statistics were 
calculated (McCullagh and Nelder, page 35). The associated p-values were 0.825 and 
0.211. Both p-values are greater than 0.05 and therefore the null hypotheses are not 
rejected. Drill performance is associated neither with the age of the ship nor the number 
of IOPs and RBOs. The complete Analysis of Deviance table is listed as Table 3. 





























                       Degrees of Freedom   Deviance   Pr(Chi) 
NULL                                        79.43 
Discrepancies at IA    1                    84.99      0.211 
Age of Ship at IA      1                    83.48      0.825 

Table 3. Analysis of Deviance Table for Logistic Regression Model of Drill 
Performance at IA. 
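
To make the log-odds scale of the logistic regression concrete, the following Python sketch converts between probability and log-odds and evaluates the probability that a drill is effective for given covariates. The coefficient values, the helper names, and the example inputs are hypothetical placeholders; the fitted coefficients for this model are not reported here.

```python
import math


def logit(p: float) -> float:
    """Convert a probability in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))


def inv_logit(eta: float) -> float:
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def drill_effective_prob(discrepancies: float, age_months: float,
                         b0: float, b_disc: float, b_age: float) -> float:
    """P(drill is effective) when
    logit(p) = b0 + b_disc * discrepancies + b_age * age_months."""
    return inv_logit(b0 + b_disc * discrepancies + b_age * age_months)


# Hypothetical coefficients, chosen only to illustrate the calculation.
b0, b_disc, b_age = 0.0, -0.05, 0.0
p = drill_effective_prob(discrepancies=3, age_months=165,
                         b0=b0, b_disc=b_disc, b_age=b_age)
print(f"Illustrative P(effective) = {p:.3f}")
```

With an age coefficient of zero, as in this illustration, changing the ship's age leaves the probability unchanged, which is what a failure to reject the null hypothesis for age corresponds to.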


For the underway demonstration, another logistic regression model was 
constructed and the same test conducted. A p-value of 0.010 was determined for the 
number of discrepancies at the underway demonstration modeled against the drill 
performance. Thus the null hypothesis of no association between drill 
effectiveness at UD and material discrepancies is rejected. 


There is evidence, therefore, of “good ships”: those with both good material condition 
and strong watchstander performance. Age of ship is a different story. The p-value was 
determined to be 0.245 for the age of the ship at the underway demonstration modeled 
against drill performance. Thus the corresponding null hypothesis of no association 
between drill effectiveness at UD and ship’s age, adjusting for material discrepancies, is 
not rejected. The corresponding conclusion is that the number of material discrepancies 
is associated with drill performance, but, adjusting for the number of material 
discrepancies, the age of the ship has no association with the performance of drills at the 
underway demonstration. The complete Analysis of Deviance table is listed in Table 4. 
































                       Degrees of Freedom   Deviance   Pr(Chi) 
NULL                                        79.42 
Discrepancies at UD    1                    85.92      0.010 
Age of Ship at UD      1                    80.76      0.245 

Table 4. Analysis of Deviance Table for Logistic Regression Model of Drill 
Performance at UD. 


2. Ship Cycle Trends 


The data included fifteen ships that repeated the interdeployment training cycle 
twice. This allows the opportunity to compare a ship’s performance across the two 
cycles. In this section we assume our sample of fifteen ships with two cycles each is like 
a random sample from a population of ships we might have seen twice, and our inference 
refers to that population. 


a. Drills 


There were four different ways to look at the performance of the drills. 
These were the same as the methods for testing the RBOs and IOPs. The null hypothesis 
is that there is no change in performance from one assessment to the other assessment. 
The alternative hypothesis is that there is a decrease in performance between the 
assessments. 


First we compare drill effectiveness from the initial 
assessment to the underway demonstration for both the first and second cycle. In the first 
cycle, a sign test was conducted with a p-value of 1. In the second cycle, another sign 
test produced a p-value of 1. In each case the null hypothesis is not rejected in favor of 
a decrease. In fact, no ship scored higher at the initial assessment than at the underway 
demonstration in either cycle (Table 5), so we conclude that ships’ performance improves 
from the initial assessment to the underway demonstration. This result was expected. 


Figure 1 shows the relationship between the initial assessment and the 
underway demonstration during the second cycle. The x-axis shows the proportion of 
effective drills performed by a ship during the initial assessment and the y-axis shows the 
proportion that the same ship had during the underway demonstration. In all cases, the 
performance at the underway demonstration was an improvement from the initial 
assessment. 


[Figure: plot titled “IA vs. UD”; x-axis: IA Drill Percentage; y-axis: UD Drill Percentage.] 

Figure 1. IA vs. UD during the second cycle. 


The next case examined the performance of the ship from the first cycle 
to the second cycle during the underway demonstration. Here there was no significant 
difference, and the null hypothesis of no change in performance between the 
underway demonstrations of the two cycles cannot be rejected. 


The final case produced the most interesting result. Here the ships were 
compared between the first initial assessment and the second initial assessment. The sign 
test had a p-value of 0.0112. This result leads to the null hypothesis being rejected and to 
the conclusion that there is a significant difference in performance between the first 
initial assessment and the second initial assessment. 


Further analysis showed that ten of the fifteen ships had actually done 
worse on the second initial assessment than they had done on the first initial assessment. 
This result suggests that ships are not as well prepared for the more recent initial 
assessments and do not perform as well as they had in the past. Table 5 shows the results 
for the drill performance. 





            Greater Values   Total Non-Ties   P-Value 

First Initial Assessment to the Second Initial Assessment 
Drills      10               13               0.0112 

First Underway Demonstration to Second Underway Demonstration 
Drills      6                14               0.6047 

First Initial Assessment to First Underway Demonstration 
Drills      0                15               1 

Second Initial Assessment to Second Underway Demonstration 
Drills      0                15               1 

Table 5. Sign Test for Drills. 


b. RBO and IOP 


There are three ways to analyze the material discrepancies. They are 
comparing RBOs from the first cycle to the second cycle, comparing the IOPs from the 
first cycle to the second, and summing the RBOs and IOPs for comparison. There are 
four different ways of testing the data, including: 


• Comparing each ship’s results on the first initial assessment to those 
on the second initial assessment; 

• Comparing each ship’s results on the first underway demonstration to 
those on the second underway demonstration; 

• Comparing each ship’s results on the first initial assessment to those 
on the first underway demonstration; 

• Comparing each ship’s results on the second initial assessment to those 
on the second underway demonstration. 


The null hypothesis is that there is no change in the average number of 
RBOs and IOPs from one ship cycle to the next. The alternative hypothesis is that there 
is a decrease from the first cycle to the second ship cycle. Since pairs of measurements on 
different ships are presumably independent, the differences within the pairs are 
independent from one pair to the next. 


The method chosen for testing is the sign test since that test requires no 
assumptions about the distribution of the set of differences. The sign test acts on 
independent pairs taken randomly from a bivariate distribution whose values are at least 
ordinal (Conover, page 157), so this test is appropriate here if ship cycles are mutually 
independent. Due to shipboard crew turnover rates, this is a reasonable assumption. The 
null hypothesis states that either member of a pair is equally likely to be the larger. 
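
The sign test described above can be sketched in a few lines of Python: under the null hypothesis, the number of non-tied pairs in which the first member is larger follows a Binomial(n, 1/2) distribution. The tail convention used here, P(X > observed count), is an assumption inferred from the values tabulated in Table 6, which it reproduces to within rounding; the convention actually used by the original software is not stated. The function name is illustrative.

```python
from math import comb


def sign_test_upper_tail(greater: int, non_ties: int) -> float:
    """P(X > greater) for X ~ Binomial(non_ties, 1/2).

    Assumed convention: a strict upper tail, chosen because it matches
    the p-values tabulated in Table 6.
    """
    tail = sum(comb(non_ties, k) for k in range(greater + 1, non_ties + 1))
    return tail / 2 ** non_ties


# (greater values, total non-ties) -> p-value as reported in Table 6.
reported = {
    (8, 10): 0.0107,   # RBO, first IA to first UD
    (10, 13): 0.0112,  # SUM, first IA to first UD
    (10, 12): 0.0031,  # RBO, second IA to second UD
    (8, 11): 0.0327,   # IOP, second IA to second UD
    (11, 13): 0.0017,  # SUM, second IA to second UD
}

for (greater, non_ties), p_reported in reported.items():
    p = sign_test_upper_tail(greater, non_ties)
    print(f"{greater} of {non_ties}: computed {p:.4f}, reported {p_reported:.4f}")
```

Because the test uses only the signs of the paired differences, it needs no assumption about how large the differences are, which is the property cited above for choosing it.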


For two of the four comparisons, the p-values were 
above the 0.05 level. This means there was no significant change in the average number 
of RBOs and IOPs from the first cycle to the second cycle, either at the initial assessment 
or at the underway demonstration. 


However, when comparing the first initial assessment to the first 
underway demonstration, there were results that showed a significant difference. For the 
first cycle, the RBOs had a paired sign test that produced a p-value of 0.0107. Also, 
when the sum of the RBOs and IOPs was compared, a paired sign test had a p-value of 
0.0112. The sign test shows that the null hypothesis was rejected. We conclude there is a 
decrease both in the number of RBOs and in the total of RBOs and IOPs from the first 
initial assessment to the first underway demonstration. Table 6 shows the results for the 
material discrepancies. 


            Greater Values   Total Non-Ties   P-Value 

First Initial Assessment to the Second Initial Assessment 
RBO         4                13               0.8665 
IOP         4                13               0.8665 
SUM         4                13               0.8665 

First Underway Demonstration to Second Underway Demonstration 
RBO         5                10               0.3769 
IOP         3                10               0.8281 
SUM         3                10               0.8281 

First Initial Assessment to First Underway Demonstration 
RBO         8                10               0.0107 
IOP         7                11               0.1132 
SUM         10               13               0.0112 

Second Initial Assessment to Second Underway Demonstration 
RBO         10               12               0.0031 
IOP         8                11               0.0327 
SUM         11               13               0.0017 

Table 6. Sign Test for Material Discrepancies. 





Again, when comparing the second initial assessment to the second 
underway demonstration, a significant result was observed. For the RBOs, the paired 
sign test produced a p-value of 0.0031. For the IOPs, the paired sign test produced a 
p-value of 0.0327. When the sum of RBOs and IOPs was compared, the p-value was 
0.0017. These results show that the null hypothesis was rejected, leading to the 
conclusion that there was a significant decrease in the number of material discrepancies 
from the second initial assessment to the second underway demonstration. 

These results suggest that, in both the first and the second cycle that ships had
completed, ships generally improved by clearing material discrepancies prior to the
underway demonstration.
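
The significant p-values in Table 6 can be checked with a short calculation. The
sketch below assumes, as an inference from the tabulated numbers rather than anything
stated in the text, that each p-value is the strict upper tail P(X > k) of a
Binomial(n, 0.5) variable, where k is the "Greater Values" count and n the number of
non-tied pairs. The function name upper_tail is illustrative only.

```python
from math import comb


def upper_tail(k: int, n: int) -> float:
    """Return P(X > k) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, j) for j in range(k + 1, n + 1)) / 2 ** n


# (greater values, total non-ties) for the rows of Table 6 quoted in the text
rows = {
    "IA1 to UD1, RBO": (8, 10),
    "IA1 to UD1, SUM": (10, 13),
    "IA2 to UD2, RBO": (10, 12),
    "IA2 to UD2, IOP": (8, 11),
    "IA2 to UD2, SUM": (11, 13),
}

for label, (k, n) in rows.items():
    print(f"{label}: p = {upper_tail(k, n):.4f}")
```

Under this assumed convention the computed tails agree with the tabulated p-values to
within rounding in the fourth decimal place.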


C. DESCRIPTION OF THE DATA 


Four regression models were fit to describe the data. The models help in estimating
the performance of a ship, but they are only models and their estimates carry error.


1. RBO Model 


We used all forty-four ships in the construction of the model. The model is the
usual Poisson regression model (McCullagh and Nelder, page 194). The number of RBOs
at the UD is assumed to follow a Poisson distribution with parameter $\lambda_i$ for
ship i = 1, 2, ..., 44; each ship has its own parameter, which depends on the number of
RBOs at the IA, denoted by $x_i$. The model proposes that

$$\mathrm{RBO(UD)}_i \sim \mathrm{Poisson}(\lambda_i) \quad \text{for } i = 1, 2, \ldots, 44$$
$$\log(\lambda_i) = \beta_0 + \beta_1 x_i \quad \text{with estimates } \hat{\beta}_0 = -0.3215,\ \hat{\beta}_1 = 0.1227$$
$$\text{Expected number of RBOs at the UD} = e^{-0.3215 + 0.1227x}$$


For example, if a ship had three RBOs at the initial assessment, then x = 3, and
the model claims that the number of RBOs at the UD would be a Poisson random variable
with parameter 1.05, meaning we would expect about one RBO at the UD on average.
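
To see what a Poisson parameter of about 1.05 implies for individual counts, the
following sketch evaluates the Poisson probabilities for a ship with three RBOs at the
IA. Only the two coefficient estimates come from the model above; the function names
are illustrative.

```python
from math import exp, factorial

B0, B1 = -0.3215, 0.1227  # RBO model estimates quoted in the text


def rbo_mean(x_ia: float) -> float:
    """Poisson mean of RBOs at the UD given x_ia RBOs at the IA."""
    return exp(B0 + B1 * x_ia)


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam ** k / factorial(k)


lam = rbo_mean(3)
print(f"mean = {lam:.2f}")
for k in range(4):
    print(f"P(RBOs at UD = {k}) = {poisson_pmf(k, lam):.3f}")
```

The printed probabilities show how the expected value spreads over counts of zero, one,
two, and three RBOs.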
2. IOP Model 


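All the data points were used to construct the model. The model is the following:

$$\mathrm{IOP(UD)}_i \sim \mathrm{Poisson}(\lambda_i) \quad \text{for } i = 1, 2, \ldots, 44$$
$$\log(\lambda_i) = \beta_0 + \beta_1 x_i \quad \text{with estimates } \hat{\beta}_0 = 0.1192,\ \hat{\beta}_1 = 0.0517$$
$$\text{Expected number of IOPs at the UD} = e^{0.1192 + 0.0517x}$$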

For example, if a ship had three IOPs at the initial assessment, then x = 3, and one
would expect, on average, 1.32 IOPs at the underway demonstration.


3. Sum of RBO and IOP Model 


Another model was constructed using the total of RBOs and IOPs to estimate the
total number that would be expected at the underway demonstration. The model is the
following:

$$\mathrm{SUM(UD)}_i \sim \mathrm{Poisson}(\lambda_i) \quad \text{for } i = 1, 2, \ldots, 44$$
$$\log(\lambda_i) = \beta_0 + \beta_1 x_i \quad \text{with estimates } \hat{\beta}_0 = 0.4643,\ \hat{\beta}_1 = 0.0714$$
$$\text{Expected number of RBOs and IOPs at the UD} = e^{0.4643 + 0.0714x}$$


Using the same example, if there were a total of six RBOs and IOPs at the initial
assessment, there would be, on average, 2.44 total discrepancies expected at the
underway demonstration.

If one sums the individual predictions for 3 RBOs and 3 IOPs, the expected total
number of discrepancies is 2.37 (1.05 + 1.32), not the 2.44 predicted by the overall
model. However, the two results are close.
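
The three worked examples can be reproduced from the quoted coefficient estimates. This
is a minimal sketch: the dictionary MODELS and the function expected_at_ud are
illustrative names, and only the coefficients come from the text.

```python
from math import exp

# (intercept, slope) estimates quoted for the three Poisson models
MODELS = {
    "RBO": (-0.3215, 0.1227),
    "IOP": (0.1192, 0.0517),
    "SUM": (0.4643, 0.0714),
}


def expected_at_ud(model: str, x_ia: float) -> float:
    """Expected count at the UD given the corresponding count x_ia at the IA."""
    b0, b1 = MODELS[model]
    return exp(b0 + b1 * x_ia)


rbo = expected_at_ud("RBO", 3)
iop = expected_at_ud("IOP", 3)
total = expected_at_ud("SUM", 6)

print(f"RBO model, x = 3: {rbo:.2f}")
print(f"IOP model, x = 3: {iop:.2f}")
print(f"Sum of separate predictions: {rbo + iop:.2f}")
print(f"SUM model, x = 6: {total:.2f}")
```

The separate and combined models need not agree exactly, because the exponential of a
sum of linear predictors is not the sum of the exponentials.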


4. Drill Model 


The final model that was constructed involved the proportion of effective drills at
the underway demonstration as a function of the proportion of effective drills at the
initial assessment. As previously discussed, this is a logistic regression model with
the logit link function. All of the data points were used in constructing the model.
The model for estimating the percentage of effectiveness on drills at the underway
demonstration is the following:

$$\mathrm{Drill}_i \sim \mathrm{Binomial}(n_i, p_i) \quad \text{for } i = 1, 2, \ldots, 44, \text{ where } n_i = \text{number of drills conducted}$$
$$\log\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \beta_1 x_i \quad \text{with estimates } \hat{\beta}_0 = 0.3117,\ \hat{\beta}_1 = 1.723$$
$$\text{Expected Proportion of Effective Drills at the UD} = \frac{1}{1 + e^{-(0.3117 + 1.723x)}}$$

For example, if a ship earned a score of 25% on the drills at the initial assessment
(x = 0.25), the expected score at the underway demonstration would be 67.75%.
Constructing the model revealed one major point. If a ship scored 0% on the initial
assessment, the underway demonstration score was predicted to be 57.73%. This is higher
than the 50% threshold that is currently in place. Therefore, the model says that, on
average, ships with a 0% at the initial assessment will pass with greater than 50% of
drills effective at the underway demonstration. Although in a set of five or ten
drills any particular ship might record a score below 50%, the system of assessment,
training, and then demonstration yields an expectation of satisfactorily completing the
readiness cycle.
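
The two predictions quoted above follow directly from the fitted logistic curve. The
sketch below assumes the IA score enters the model as a proportion between 0 and 1,
which is consistent with the quoted results; the function name is illustrative.

```python
from math import exp

B0, B1 = 0.3117, 1.723  # drill model estimates quoted in the text


def expected_ud_proportion(ia_proportion: float) -> float:
    """Expected proportion of effective drills at the UD.

    ia_proportion is the IA drill score on a 0-1 scale (assumed).
    """
    return 1.0 / (1.0 + exp(-(B0 + B1 * ia_proportion)))


for ia in (0.0, 0.25):
    print(f"IA score {ia:.0%} -> expected UD score {expected_ud_proportion(ia):.2%}")
```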


D. PROGRAMS 


As previously listed, there are sixteen different administrative programs that get
reviewed. As before, a Poisson model was constructed to relate the programs to the
number of RBOs and IOPs, since these counts are integer-valued. The model was built by
stepwise regression using the Akaike Information Criterion (AIC). At each step, the
predictor whose removal produces the largest decrease in AIC is removed from the model,
until no removal lowers the AIC further, which determines the most efficient model
(Hamilton, 1992). The complete Analysis of Deviance table is listed in Table 7.



































                  | Degrees of Freedom | Deviance | Pr(Chi)
NULL              |                    |   91.91  |
Bearing Records   |         1          |  115.88  | 0
Legal Records     |         1          |  111.07  | 0
Quality Assurance |         1          |  101.33  | 0.002
Operating Logs    |         1          |   97.00  | 0.024

Table 7. Programs at the Initial Assessment Analysis of Deviance Table.
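
For reference, the criterion used in the stepwise search has the standard form below,
where k is the number of estimated parameters and $\hat{L}$ is the maximized likelihood
of the fitted model; smaller values indicate a better trade-off between fit and model
size. The notation is the conventional one and is not taken from the thesis.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
```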


The model does not produce an estimate of how well a ship will perform;
however, it does show which ships do well by looking at four particular programs:
bearing records, legal records, quality assurance, and operating logs.

For example, there were ten ships out of the forty-four ships that scored “not
effective” in three of the four programs. Six of these ten ships did not have an
effective drill at the initial assessment. There were a total of four ships that had
more than ten IOPs and RBOs at the initial assessment. Of these four ships, two had
scored “not effective” in three of these programs. Of the two ships that had ten or
more IOPs and RBOs at the underway demonstration, both had scored “not effective” in
three of the four programs.


On the other hand, there were seven ships that had scored effective in three of the
four programs. Three of these seven ships received a score of 80% or higher in drills
at the underway demonstration, and five of these seven received a score greater than
62.5%. Also, the largest number of IOPs and RBOs observed at the underway demonstration
for any of the seven ships was four, and four of the seven ships had no IOPs or RBOs at
the underway demonstration.


The programs do not model the performance of the ship at the underway 
demonstration; rather the model suggests that a ship that does well at the underway 
demonstration performed well in these four management programs, again lending 
credence to the model of a “good ship,” one whose performance and administrative 
management are above average. 


E. CONTROL CHARTS 


The normal distribution is the most widely used approximation for describing a
population. Many numerical populations have distributions that can be fit with the
normal curve. If one can assume that all ships’ drill performances are fit by the normal
approximation, then it would be reasonable to designate ships with performance more
than one standard deviation below the mean as “below average” and those more than one
standard deviation above the mean as “above average”.


1. Drills at the Initial Assessment 


The average drill performance at the initial assessment since 1999 has been
13.55%. The standard deviation was 16.23%. Going one standard deviation below the
mean places the “below average” mark below a score of zero. This suggests that the
normal distribution is not a good approximation to the data. Figure 2 is a graphical
representation of the data.


[Plot: IA Drill Scores vs. Year. Vertical axis: Drill Percentage. Horizontal axis:
Year of Assessment, 1999 to 2003.]

Figure 2. IA Effective Drill Scores compared to the year of the Assessment.


The best-fit models for drills used the binomial distribution, and interval
estimation is a reasonable inference regarding a random sample (Conover, page 129). A
good range for designating average ships is the middle 80%. Ships above the 90th
percentile would then be designated “above average” and those below the 10th percentile
would be “below average.” These ranges depend on the number of drills attempted.
Since 13% of drills were completed successfully at the initial assessment, we compute
the 10th and 90th percentiles of the binomial distribution with n trials and
probability 0.13 for each n = 4, 5, ..., 16.
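
The percentile calculation described above can be sketched as follows. The quantile
convention used here (the smallest count k whose cumulative probability reaches the
requested level) is an assumption; the text does not say how values falling exactly on
a percentile are assigned, so these cut-offs are not guaranteed to reproduce the
published ranges for every n. The function names are illustrative.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k + 1))


def binom_quantile(q: float, n: int, p: float) -> int:
    """Smallest k with P(X <= k) >= q (assumed percentile convention)."""
    for k in range(n + 1):
        if binom_cdf(k, n, p) >= q:
            return k
    return n


P_IA = 0.13  # proportion of effective drills at the IA, from the text

for n in range(4, 17):
    low = binom_quantile(0.10, n, P_IA)
    high = binom_quantile(0.90, n, P_IA)
    print(f"n = {n:2d}: 10th percentile = {low}, 90th percentile = {high}")
```

With fourteen drills attempted, the 90th percentile under this convention is three
effective drills, so a ship with four effective drills lies above it.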


When three or fewer drills are attempted, it is very difficult to determine the level
at the initial assessment. However, if four or more drills are attempted, one can
determine the level of the ship by this scheme. For example, if a ship attempts fourteen
drills and four drills are assessed as “effective”, that ship would receive an “above
average” rating for drills at the initial assessment. Table 8 depicts the range for
determining whether a ship is below average, average, or above average based on the
number of drills attempted and the number assessed as effective.







































































Number of Drills |    Number of Drills Assessed as Effective
Attempted        | Below Average | Average | Above Average
        1        |       0       |   NA    |       1
        2        |       0       |   NA    |       1
        3        |       0       |   NA    |       1
        4        |       0       |   1     |       2
        5        |       0       |   1     |       2
        6        |       0       |   1     |       2
        7        |       0       |   1     |       2
        8        |       0       |   1,2   |       3
        9        |       0       |   1,2   |       3
       10        |       0       |   1,2   |       3
       11        |       0       |   1,2   |       3
       12        |       0       |   1-3   |       4
       13        |       1       |   2,3   |       4
       14        |       1       |   2,3   |       4
       15        |       1       |   2,3   |       4
       16        |       1       |   2-4   |       5

Table 8. Range for Effective Drills at the Initial Assessment.

2. Drills at the Underway Demonstration


Similar results were found with the data at the underway demonstration. Since
1999, the mean has been 63.03% with a standard deviation of 19.71%. If the data are
assumed to be normally distributed, one standard deviation on either side of the mean
covers about 68.27% of the population. This range for an average ship would run from
43.32% to 82.75%. The lower limit falls below the 50% threshold for passing an
underway demonstration. Figure 3 is a graphical representation of the data.


[Plot: UD Drill Scores vs. Year. Vertical axis: Drill Percentage. Horizontal axis:
Year of Assessment, 1999 to 2003.]

Figure 3. UD Effective Drill Scores compared to the year of the Assessment.
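
The difficulty with the one-standard-deviation rule at both assessments can be checked
with simple arithmetic. The sketch below uses only the means and standard deviations
quoted in the text; the function name is illustrative, and the coverage figure is the
standard normal probability of falling within one standard deviation of the mean.

```python
from math import erf, sqrt


def one_sd_band(mean: float, sd: float) -> tuple[float, float]:
    """Lower and upper limits one standard deviation from the mean."""
    return mean - sd, mean + sd


ia_low, ia_high = one_sd_band(13.55, 16.23)  # IA drill scores, percent
ud_low, ud_high = one_sd_band(63.03, 19.71)  # UD drill scores, percent
coverage = erf(1 / sqrt(2))                  # normal mass within one SD

print(f"IA band: {ia_low:.2f}% to {ia_high:.2f}%")
print(f"UD band: {ud_low:.2f}% to {ud_high:.2f}%")
print(f"Normal coverage within one SD: {coverage:.4f}")
```

The IA lower limit is negative, which is impossible for a percentage, and the UD lower
limit sits below the 50% passing threshold.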


Here again, the normal distribution is not a good approximation for the
distribution of the data. As with the initial assessment, using the binomial
distribution allows a percentile range to be determined based on the number of drills
attempted. Using the same standards as in the initial assessment, the 90th percentile
would delineate the “above average” ships and the 10th percentile would determine the
“below average” ships. For example, if a ship attempts fourteen drills and seven drills
are assessed as “effective”, that ship would receive a “below average” rating for
drills at the underway demonstration. However, seven out of fourteen drills is 50%. The
ship would pass the underway demonstration, but with a below average rating in drill
effectiveness. Table 9 gives the range for determining whether a ship is below average,
average, or above average based on the number of drills attempted and the number
assessed as effective.







































































Number of Drills |    Number of Drills Assessed as Effective
Attempted        | Below Average | Average | Above Average
        1        |       0       |   NA    |       1
        2        |       0       |   1     |       2
        3        |       1       |   2     |       3
        4        |       1       |   2,3   |       4
        5        |       2       |   3,4   |       5
        6        |       2       |   3,4   |       5
        7        |       3       |   4,5   |       6
        8        |       3       |   4-6   |       7
        9        |       4       |   5-7   |       8
       10        |       4       |   5-7   |       8
       11        |       5       |   6-8   |       9
       12        |       5       |   6-9   |      10
       13        |       6       |   7-9   |      10
       14        |       7       |   8-10  |      11
       15        |       7       |   8-11  |      12
       16        |       8       |   9-12  |      13

Table 9. Range for Effective Drills at the Underway Demonstration.


As another example, if a ship attempts twelve drills and six are assessed as
effective, that ship would receive an “average” rating for drills at the underway
demonstration. Both ships in these examples scored 50% on drills, yet one ship would be
“average” and the other “below average” in readiness.
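
The two examples can be expressed as a lookup against Table 9. This is a sketch under
two assumptions: the "Below Average" entry is read as "this many or fewer" and the
"Above Average" entry as "this many or more", and cells that are illegible in the
source table are inferred from the contiguity of the ranges. The names UD_RANGES and
classify_ud are illustrative.

```python
# drills attempted -> (largest "below average" count, smallest "above average" count)
UD_RANGES = {
    1: (0, 1), 2: (0, 2), 3: (1, 3), 4: (1, 4),
    5: (2, 5), 6: (2, 5), 7: (3, 6), 8: (3, 7),
    9: (4, 8), 10: (4, 8), 11: (5, 9), 12: (5, 10),
    13: (6, 10), 14: (7, 11), 15: (7, 12), 16: (8, 13),
}


def classify_ud(attempted: int, effective: int) -> str:
    """Classify UD drill performance using the ranges transcribed from Table 9."""
    below_max, above_min = UD_RANGES[attempted]
    if effective <= below_max:
        return "below average"
    if effective >= above_min:
        return "above average"
    return "average"


print(classify_ud(14, 7))  # seven of fourteen drills effective
print(classify_ud(12, 6))  # six of twelve drills effective
```

Both calls describe a ship scoring 50%, but the category depends on the number of
drills attempted.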


3. IOPs and RBOs 


A similar approach was used for the IOPs and RBOs. The average number of
discrepancies at the initial assessment was a total of 4.409 IOPs and RBOs. During the
underway demonstration, the average was 2.295 discrepancies. This was an
improvement of more than two cleared discrepancies on average from the initial
assessment to the underway demonstration.


In this case, since IOPs and RBOs are integers, we model their occurrence as
events from a Poisson process with mean 4.409 discrepancies per ship at the initial
assessment and mean 2.295 discrepancies per ship at the underway demonstration. At
the initial assessment, the 10th and 90th percentiles of this distribution are 2 and 7,
respectively. Figure 4 shows the data together with these percentiles.


[Plot: IA RBO and IOP vs. Year. Vertical axis: Sum of RBO and IOP. Horizontal axis:
Year of Assessment, 1999 to 2003.]

Figure 4. Number of RBOs and IOPs at the IA compared to the year of the Assessment.
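
The Poisson percentiles used as control limits can be computed directly from the two
means. The sketch below assumes the percentile is the smallest count whose cumulative
probability reaches the requested level; the function name is illustrative.

```python
from math import exp


def poisson_quantile(q: float, lam: float) -> int:
    """Smallest k with P(X <= k) >= q for X ~ Poisson(lam)."""
    k = 0
    term = exp(-lam)  # P(X = 0)
    cdf = term
    while cdf < q:
        k += 1
        term *= lam / k  # P(X = k) from P(X = k - 1)
        cdf += term
    return k


for label, mean in (("IA", 4.409), ("UD", 2.295)):
    low = poisson_quantile(0.10, mean)
    high = poisson_quantile(0.90, mean)
    print(f"{label}: 10th percentile = {low}, 90th percentile = {high}")
```

Under this convention the initial assessment limits are 2 and 7, and the same
calculation with the underway demonstration mean gives 0 and 4.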

Supporting the theory that ships are doing more poorly in recent years than they
did in 1999 and 2000, the relative number of ships in the below average category is
increasing. Table 10 shows the numbers of ships in each category.












































     | Number   | Seven or More     | Two or Fewer      | Percent | Percent
Year | of Ships | Discrepancies     | Discrepancies     | Below   | Above
     |          | (“Below Average”) | (“Above Average”) | Average | Average
2003 |    2     |         1         |         1         |   50%   |   50%
2002 |   15     |         5         |         6         |   33%   |   40%
2001 |   13     |         1         |         6         |    7%   |   46%
2000 |   11     |         2         |         3         |   18%   |   27%
1999 |    3     |         0         |         2         |    0%   |   67%

Table 10. Number of Discrepancies at the IA per Year by Percentile Category.


Section B discussed the number of IOPs and RBOs at the initial assessment, where
the sign tests on the totals gave p-values of 0.0112 and 0.0017 and the null hypothesis
was rejected. The percentage of below average ships was lower in 1999 and 2000 than in
recent years. This suggests that the initial assessment performance in material
discrepancies has deteriorated in the past few years.


At the underway demonstration, the 10th and 90th percentiles of this distribution
are 0 and 4, respectively. Table 11 shows the numbers of ships in each category.






































     | Number   | Four or More      | No Discrepancies  | Percent | Percent
Year | of Ships | Discrepancies     | (“Above Average”) | Below   | Above
     |          | (“Below Average”) |                   | Average | Average
2003 |    4     |         1         |         3         |   25%   |   75%
2002 |   14     |         3         |         5         |   21%   |   36%
2001 |   13     |         7         |         3         |   54%   |   23%
2000 |   11     |         0         |         5         |    0%   |   45%
1999 |    2     |         0         |         1         |    0%   |   50%

Table 11. Number of Discrepancies at the UD per Year by Percentile Category.


There were no ships in the “below average” category in 1999 or 2000. This
supports the rejection of the null hypothesis from Section B and suggests that the
material condition of the ships has been declining over the past few years. Figure 5
shows the data together with the percentiles.


[Plot: UD RBO and IOP vs. Year. Vertical axis: Sum of RBO and IOP. Horizontal axis:
Year of Assessment, 1999 to 2003.]

Figure 5. Number of RBOs and IOPs at the UD by year of the Assessment.

V. CONCLUSION AND RECOMMENDATION


The purpose of this thesis is to explore the idea of engineering readiness and
training on board United States Navy surface combatants. The analysis produces many
results that were expected and a few that were not.


The first important result is that the age of a ship has no association with
performance of drills and that the number of discrepancies has a direct bearing on the
performance of drills. The number of Items of Priority (IOP) and Repair Before
Operating (RBO) discrepancies does not increase as the age of the ship increases. The
best explanation for this result is that the age of equipment is independent of the age
of the ship. Through the years, preventative maintenance is performed on the equipment,
effectively resetting the life of the equipment to zero. As material discrepancies are
cleared, ships experience better performance in drills. To achieve good watchstander
performance, the material condition of the ship needs to be maintained at a high level.


Another result shows that drill performance decreased from the first initial
assessment to the second initial assessment. Ships performed better in 1999 and 2000
than they did in 2001 and 2002. Either ships are not preparing for the initial
assessment as well as they did in the past, ATG is observing drills more critically, or
the turn-around from deployment to the initial assessment does not permit as much
preparation as in the past. Another explanation could be that the number of material
discrepancies has been increasing in recent years. As stated previously, an increase in
discrepancies could best explain the decreased drill scores. Each of these explanations
needs further analysis to determine the actual cause.


The third result shows that, on average, the number of material discrepancies
decreases from the initial assessment to the underway demonstration. One explanation is
that corrective maintenance on the equipment is completed as the engineering department
goes through the training cycle. Training performance and material condition are
complementary. This too may help explain why the age of the ship does not influence the
number of discrepancies.

The fourth result shows that there are four programs that good ships do well on.
These programs were bearing records, legal records, quality assurance, and operating
logs. The results demonstrated that ships that perform well during the underway
demonstration share the common feature of good program management. The attention to
detail suggested by this characteristic bears out in other components of shipboard
readiness.


Control charts depict upper and lower limits on the number of material
discrepancies and drill proficiency. The charts were produced to determine a level for
identifying above average, below average, and average shipboard engineering readiness.
Standards were developed for both the initial assessment and underway demonstration.
As the engineering department trains for the assessments, the control charts may be a
helpful tool for the type commander or ATG to gauge how a specific department
compares to other ships’ engineering departments.


The final and most important result was that training is effective. In all cases,
ships did the same or better on the underway demonstration than they did on the initial
assessment. This result shows the training process, as currently recorded, is working
and that the process should continue.


A comparative study of East Coast ships would be valuable further analysis. Such
a study might show whether both coasts have similar levels of training. It would also
help in determining whether ships on the East Coast are more prepared than those on the
West Coast, and identify Navy-wide, not just fleet-wide, trends.


Another comparative study would be collection and analysis of data for
amphibious ships. This thesis only explored combatant ships on the West Coast. A
follow-on study might show whether both types of ships have similar levels of readiness.
With the combination of analysis of both coasts and combatant and amphibious ships,
COMNAVSURFOR could determine the state of all surface ship engineering readiness.


It is recommended that data be collected and the models updated every year. The
models might become more of an asset as more ships are pooled into the group. In
addition, the drill scores can continue to be plotted against the year of the
assessment to determine whether the downward trend is continuing or whether ships are
improving. This can prove to be a quality management tool for shipboard engineering
readiness.

APPENDIX A: STANDARD EQUIPMENT LIST

          |        CG/DD        |         DDG         |         FFG
Equipment | Required | On-Board | Required | On-Board | Required | On-Board





GTM
GTG/SSDG
MRG
CRP/CPP
WHB
HPAC
LPAC
LO Strainer
FO Strainer
FO Transfer
Firepump
SWS Pump























Note:
1. One per shaft. Two engines on each of two shafts.
2. One per operational GTG.
3. Sufficient fire pumps should be in commission to set Zebra.
(Instruction 3540.9, Chapter 3, p. 3-A-2-1)

APPENDIX B: INDEPENDENT VARIABLES

VARIABLE | NAME           | VARIABLE DESCRIPTION
Ia       | IA % Effective | The percent effective for each ship at the IA
ia.age   | Age of ship    | The age of the ship, in months, during the IA.
ia.big   | IA data        | The complete set of data points for all ships at the IA
Ud       | UD % Effective | The percent effective for each ship at the UD
ud.age   | Age of ship    | The age of the ship, in months, during the UD.
ud.big   | UD data        | The complete set of data points for all ships at the UD